# OCR-free document understanding

Donut Base Encoder

Donut is an OCR-free document understanding Transformer model that processes document images directly through a visual encoder, with no separate OCR step.

Text Recognition

OCR DocVQA Donut

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and text decoder for document visual question answering tasks.
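
To illustrate how a question reaches the text decoder, here is a minimal sketch of prompt construction and answer extraction for document visual question answering. The prompt template is an assumption based on the convention used by publicly released Donut DocVQA checkpoints, and the helper names `build_docvqa_prompt` and `extract_answer` are hypothetical. The visual encoder and the generation step are omitted; only the text side is shown.

```python
import re

# Assumed prompt template (convention of public Donut DocVQA checkpoints,
# not confirmed by this listing).
DOCVQA_TEMPLATE = "<s_docvqa><s_question>{question}</s_question><s_answer>"


def build_docvqa_prompt(question: str) -> str:
    """Wrap a user question in the task prompt that seeds the text decoder."""
    return DOCVQA_TEMPLATE.format(question=question.strip())


def extract_answer(decoded: str):
    """Return the text after <s_answer>, up to </s_answer> or end of string.

    Returns None when the decoded sequence contains no answer tag.
    """
    match = re.search(r"<s_answer>(.*?)(?:</s_answer>|$)", decoded, flags=re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip()


if __name__ == "__main__":
    prompt = build_docvqa_prompt("What is the total?")
    print(prompt)
    # A made-up decoder output, used only to exercise the parser.
    print(extract_answer(prompt + " 12.50</s_answer></s>"))
```

In a real pipeline the prompt would be tokenized and passed to the decoder alongside the encoded page image, and the generated token sequence would be decoded back to a string before extraction.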

Donut is an OCR-free document understanding model built on a Swin Transformer visual encoder and a BART text decoder. This version is fine-tuned on the CORD receipt dataset.
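
A receipt-parsing model of this kind typically emits a tagged token sequence rather than plain text. The sketch below shows one simplified way to turn such a sequence into a nested dictionary. It assumes the `<s_key>value</s_key>` markup used by Donut-style decoders; the function name `tags_to_dict`, the field names, and the sample values are made up for illustration. It does not handle repeated items, which a full converter (for example `DonutProcessor.token2json` in Hugging Face Transformers) would.

```python
import re

# Matches one '<s_key> ... </s_key>' pair; the backreference ensures the
# closing tag carries the same key as the opening tag.
_FIELD = re.compile(r"<s_([^>]+)>(.*?)</s_\1>", flags=re.DOTALL)


def tags_to_dict(sequence: str):
    """Convert '<s_key>value</s_key>' markup into nested dicts (simplified).

    Text without any tagged field is returned as a stripped string.
    Repeated keys are NOT merged into lists; the last occurrence wins.
    """
    result = {}
    for match in _FIELD.finditer(sequence):
        key, inner = match.group(1), match.group(2)
        result[key] = tags_to_dict(inner)  # recurse into nested fields
    return result if result else sequence.strip()


if __name__ == "__main__":
    # Hypothetical decoder output for a one-item receipt.
    sample = (
        "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price></s_menu>"
        "<s_total><s_total_price>4.50</s_total_price></s_total>"
    )
    print(tags_to_dict(sample))
```

Keeping the output as tagged text lets a single decoder produce arbitrarily nested structures, at the cost of a parsing step like this one after generation.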

© 2025 AIbase